Abstract

We study routing and scheduling in packet-switched networks. We assume an adversary that 
controls the injection time, source, and destination for each packet injected. A set of paths for 
these packets is admissible if no link in the network is overloaded. We present the first on-line 
routing algorithm that finds a set of admissible paths whenever this is feasible. Our algorithm 
calculates a path for each packet as soon as it is injected at its source using a simple shortest 
path computation. The length of a link reflects its current congestion. We also show how our 
algorithm can be implemented under today's Internet routing paradigms. 

When the paths are known (either given by the adversary or computed as above) our goal 
is to schedule the packets along the given paths so that the packets experience small end-to-end
delays. The best previous delay bounds for deterministic and distributed scheduling
protocols were exponential in the path length. In this paper we present the first deterministic 
and distributed scheduling protocol that guarantees a polynomial end-to-end delay for every 
packet. 

Finally, we discuss the effects of combining routing with scheduling. We first show that some 
unstable scheduling protocols remain unstable no matter how the paths are chosen. However, the 
freedom to choose paths can make a difference. For example, we show that a ring with parallel 
links is stable for all greedy scheduling protocols if paths are chosen intelligently, whereas this 
is not the case if the adversary specifies the paths. 

1 Introduction 

Two of the most important problems in the control of packet-switched networks are routing and 
scheduling. The goal of routing is to assign a path to a packet from its source to its destination. 
The goal of scheduling is to deal with the contention that occurs when two or more packets wish 
to cross a link simultaneously. Each link must have a scheduler that resolves this contention by 
deciding which packet to advance. 

The scheduling problem typically assumes that the paths of the packets are given as part of 
the input. The goal is then to schedule the packets along their paths in such a way that they 
all reach their destinations in a short time. Much recent work has focused on the Adversarial
Queueing Model, e.g. [?]. We follow their convention and assume that all packets are unit
size and each link processes one packet per time step. In this Adversarial Queueing Model, the
adversary chooses the injection time, source, destination, and route for each packet injected. A
sequence of injections is called (w, r)-admissible for a window size w and injection rate r < 1 if,
in any time interval of T ≥ w steps, the total number of packets injected into the network whose paths
pass through any link e is at most Tr. These paths are also called (w, r)-admissible. Previous work
has examined the performance of a number of simple scheduling protocols in this model. A packet
scheduling protocol is said to be universally stable if it guarantees bounded buffer sizes and packet
transmission delays for any (w, r)-admissible injections. In [?] it was proved that several natural
protocols (Longest-In-System, Shortest-In-System, Furthest-To-Go) are universally stable, whereas
several others (First-In-First-Out, Last-In-First-Out, Nearest-To-Go) are not.

* Partially supported by DIMACS funding. A preliminary version of this paper appeared in the Proceedings of
the 42nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2001.
† Bell Laboratories, andrews@research.bell-labs.com.
‡ GSyC, ESCET, Universidad Rey Juan Carlos, Spain, anto@gsyc.escet.urjc.es.
§ Department of Computer Science, University of Southern California, agoel@cs.usc.edu.
¶ Bell Laboratories, ylz@research.bell-labs.com.

In this paper we study both routing and scheduling. The adversary no longer specifies the route 
of each packet; it merely specifies the source and destination. However, we are guaranteed that 
(w, r)-admissible paths for the injections do exist. The problem is now two-fold. We first need to
find some (W, R)-admissible paths, possibly for a different window size W and a different rate R < 1.
These admissible paths combined with a universally stable scheduling scheme, such as the ones
in [?] or the one presented in Section 3 of this paper, result in a universally stable protocol for
routing and scheduling. 



1.1 Source Routing for Stability 

Our result. In Section 2 of the paper we present the first online algorithm for assigning admissible
routes to packets. If the adversary can assign (w, r)-admissible routes, then our algorithm finds
a set of (W, R)-admissible routes where $R \in (r, 1)$ is of our choice and W > w is determined by
the choice of R. Hence, if the parameter of merit is the window size w, then our algorithm is a
W/w-approximation algorithm (modulo a small increase in the rate). Moreover, our algorithm is
online in that it assigns routes to packets as soon as they are injected into the network. Hence it can
also be regarded as a W/w-competitive algorithm for this problem. This is the first approximation
algorithm/competitive algorithm for this problem. Once the routes are chosen, we can use any
"good" scheduling protocol in the Adversarial Queueing Model.

Our algorithm is based on the $\varepsilon$-approximation algorithm for fractional maximum multicommodity
concurrent flow given by Garg and Könemann [10], which in turn builds upon the work of
Plotkin, Shmoys, and Tardos [13] and Young [18]. In the maximum multicommodity concurrent
flow problem, the demands for each commodity remain constant as the algorithm progresses. In
our setting, the demands between source-destination pairs correspond to the packets injected by
the adversary, which can change over time. Even though the algorithm of Garg and Könemann
is an offline algorithm that assigns fractional paths to a fixed set of commodities, in our setting we
are able to convert it into an online algorithm that assigns an integral path to each packet as soon
as it is injected.



Implementation under Internet routing paradigms. At a high level, our algorithm works 
as follows. Each link maintains a measure of congestion that represents how many packets have 
been routed through it in the recent past. Packets are then routed on shortest paths with respect 
to this congestion measure. Hence we need a mechanism for distributing congestion information 
from the links to the source nodes. We also need a mechanism by which a source node can inform 
a link whenever it routes a packet through that link. 

The first requirement could be satisfied by something akin to the OSPF (Open Shortest Path 
First) link state flooding protocol. (See e.g. flllfl .) This is a protocol that is used for flooding link 
state information to the nodes in a network so that packets may be routed along shortest paths. 
The second requirement may be satisfied by the MPLS (Multi-Protocol Label Switching) protocol
that is gaining increasing acceptance in the Internet. (See e.g. [15].) With this protocol a source 
node can compute an explicit route to each destination and then distribute a label for the route to 
each of the links that comprise the route. In combination with this label distribution the source 
can also specify how much traffic it is going to send on the route. 

In Section 2 we first assume that this control information is transmitted instantaneously and
does not contribute to the congestion in the network. We then consider a model in which the 
control information is transmitted in-band through the network and must contend with the data 
traffic. 



Relation to previous work. Routing and scheduling as a combined problem has been studied
in the past. For example, Aiello et al. presented a distributed algorithm [?] motivated by the
Awerbuch-Leighton multicommodity flow algorithm [?]. In [?] Gamarnik gave a solution based on an
approximation algorithm for static routing. However, both these algorithms require a dependence
between how a packet is routed and how it is scheduled. Hence, their routing schemes only work
in association with their specific scheduling schemes, but not with generic scheduling algorithms.
Neither routing algorithm can be used to provide packets with admissible paths at injection time.
Using networking terminology, these routing algorithms correspond to active routing [?], where
intermediate routers need to actively participate in determining routes for each individual packet. 
In contrast, our algorithm corresponds to source routing, where the entire path of a packet is known 
at the source. 



1.2 Deterministic Distributed Scheduling with Polynomial Delays 

In Section 3 of the paper we study the scheduling problem in isolation assuming that (w, r)-admissible
paths are given. In recent years, a number of scheduling algorithms have been proposed
that guarantee network stability, i.e. the number of packets in the network remains bounded and the
end-to-end delay experienced by packets remains bounded. For example, the Longest-In-System
protocol, which always gives priority to the packet injected into the system earliest, was shown in
[?] to guarantee a delay bound of $O(w/(1 - r)^{d_{\max}})$, where $d_{\max}$ is the maximum length of a path
assigned to any packet. Note, however, that this bound is exponential in $d_{\max}$. It has been an open
problem whether or not any deterministic, distributed scheduling protocol has a polynomial delay
bound in the Adversarial Queueing Model. Indeed, [?] remarked that "it is of considerable interest
to determine whether such a protocol exists".

A randomized protocol based on Longest-In-System can guarantee that each packet experiences
a delay of $\mathrm{poly}(w, 1/(1 - r), d_{\max}, \log m)$ with high probability [?], where m is the number of links
in the network. In essence, for most of the time the protocol is successful and keeps all delays 
small. However, even if the failure probability is small, if the algorithm is run for an extended 
period of time then the algorithm is likely to make some random choices that are bad. This causes 
packets to violate the delay bound. Moreover, if one packet violates the delay bound then other 
packets injected along the same path at similar times are also likely to violate the delay bound. 
Hence, all of the packets that make up a single file transfer could be excessively delayed. Although 
this randomized protocol can be derandomized in a centralized manner it seems hard to convert it 
into a deterministic, distributed protocol. This is because the "success condition" involves packets 
injected at multiple source nodes and hence it cannot be verified locally. 

Our result. In Section 3 we present the first deterministic, distributed scheduling protocol with a
polynomial delay bound. It guarantees that all packets reach their destination within $\mathrm{poly}(w, 1/(1 - r), m)$
steps of their injection. We start by presenting a randomized protocol in which the "success
condition" can be verified at the source nodes independently. This allows us to derandomize the 
protocol in a distributed fashion. 

1.3 The Effects of Combining Source Routing with Scheduling 

In the final part of the paper we consider the following question: Is it possible for unstable scheduling 
protocols to become stable if paths can be chosen by a routing algorithm as opposed to being 
dictated by the adversary? We first present a network and a sequence of packet injections such 
that regardless of how the routes for these packets are chosen, many greedy protocols (including 
FIFO) remain unstable. Thus, we cannot hope to achieve stability using FIFO even if we have 
the freedom to choose routes. However, we also present an example in which the ability to select 
the routes does make a difference. We show that in a "ring" with multiple parallel links, if we are 
allowed to choose the routes intelligently then we can ensure that all greedy scheduling protocols 
are stable. However, if the adversary dictates the routes then many scheduling protocols (including 
FIFO) are unstable. 

1.4 Other Related Work 

Much traditional work on routing focuses on the problem of routing flows online, e.g. [?, ?]. Each
flow requests a bandwidth from a source to a destination and we must choose a path for each
accepted flow without violating any link capacity. The goal is to maximize the on-line acceptance
rate. However, this work does not consider packet-level behavior.

The problem of choosing routes for a fixed set of packets was studied by Srinivasan and Teo [16]
and Bertsimas and Gamarnik [?]. For example, [16] presents an algorithm that minimizes the
congestion and dilation of the routes up to a constant factor. This result complemented the paper
of Leighton, Maggs and Rao [12], which showed that packets could be scheduled along a set of paths
in time O(congestion + dilation).

2 Source Routing for Stability 

For convenience we use the following weaker notion of admissibility in this section. We say that
a set of packet paths is weakly (w, r)-admissible if we can partition time into windows of length w
such that for each window in the partition and each link e, the number of paths that pass through
e and correspond to packets injected during the window is at most wr. However, this distinction
is not important due to Lemma 1. Moreover, all of the delay bounds that have been derived in the
past for the Adversarial Queueing Model apply to weakly (w, r)-admissible paths.

Lemma 1 If a set of paths is (w, r)-admissible then it is also weakly (w, r)-admissible. Conversely,
weak (w, r)-admissibility implies (w', r')-admissibility for some w' ≥ w and $r' \in [r, 1)$.

Proof: Suppose the injections are weakly (w, r)-admissible. We show that they are (w', r')-admissible
for r' = (1 + r)/2 and w' = 4w/(1 − r). For any T ≥ w', let T be in the range
[nw, (n + 1)w), where n is an integer at least 4r/(1 − r). An interval of T < (n + 1)w steps
intersects at most n + 2 windows of the partition. Due to weak admissibility and our
choices of n, T and r', the number of injections during T steps for any link e is at most

$$(n + 2)rw \le nw(1 + r)/2 \le Tr'.$$

The other direction is trivial.
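
As an illustration of the definition of weak admissibility, the following Python sketch tests the condition for one fixed partition of time into windows [0, w), [w, 2w), and so on. The function name, the input format (an injection time together with a list of link identifiers) and the choice of partition offset are assumptions made for this example only; the definition itself merely asks that some partition exists, so a `False` result here does not rule out a different offset.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence, Tuple


def is_weakly_admissible(
    injections: Iterable[Tuple[int, Sequence[Hashable]]], w: int, r: float
) -> bool:
    """Check weak (w, r)-admissibility for the partition [0, w), [w, 2w), ...

    Each injection is (injection_time, path), where path lists the links used.
    The partition offset 0 is an assumption of this sketch.
    """
    load = Counter()  # (window index, link) -> number of paths through the link
    for time, path in injections:
        window = time // w
        for link in set(path):  # a path is counted once per link it uses
            load[(window, link)] += 1
    # Every link may carry at most w * r paths per window.
    return all(count <= w * r for count in load.values())
```

For instance, three packets injected at times 0, 1 and 5 with paths `["a", "b"]`, `["a"]` and `["a"]` put two paths on link `a` in the first window of length 4, which is allowed for r = 0.5 but not for r = 0.25.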



We assume an adversary that injects weakly (w, r)-admissible packets into the network.¹ Our
aim is to choose weakly (W, R)-admissible routes for these packets, where $R \in (r, 1)$ is of our choice
and W > w is determined by the choice of R.

2.1 The Basic Routing Protocol 

We first assume that control information is communicated instantaneously. Whenever a source 
node chooses a route for a packet, this information is instantaneously transmitted to all the links 
on the route. Whenever the congestion on a link changes, this fact is instantaneously transmitted 
to all the source nodes. Later on we relax these assumptions. As mentioned in the Introduction, the 
algorithm is based on the Garg-Könemann offline approximation algorithm for fractional maximum
concurrent flow. However, in our setting we can convert it into an online algorithm that chooses 
integral paths for the packets. 



Find routes.

  Initialize c(e) = δ, ∀e
  for the ith window, i = 1, ..., t
      for each packet injected during the ith window
          p ← least congested route under c (i.e. shortest path with respect to c)
          c(e) ← c(e)(1 + μ/w), ∀e ∈ p

Figure 1: Procedure to find routes for packets injected during one phase.
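
The procedure of Figure 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the graph representation (directed links as node pairs), the helper names `least_congested_path` and `route_phase`, and the use of Dijkstra's algorithm for the shortest-path step are all assumptions of the sketch. The parameters `mu` and `delta` are passed in by the caller.

```python
import heapq
from typing import Dict, Hashable, List, Sequence, Tuple

Link = Tuple[Hashable, Hashable]  # directed link (u, v); an assumed representation


def least_congested_path(c: Dict[Link, float], src, dst) -> List[Link]:
    """Dijkstra over directed links, using the congestion c(e) as link length."""
    adj: Dict[Hashable, List[Link]] = {}
    for link in c:
        adj.setdefault(link[0], []).append(link)
    dist = {src: 0.0}
    prev: Dict[Hashable, Link] = {}
    heap = [(0.0, 0, src)]
    counter = 1  # tie-breaker so heapq never has to compare node objects
    done = set()
    while heap:
        d, _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for e in adj.get(u, []):
            v, nd = e[1], d + c[e]
            if v not in dist or nd < dist[v]:
                dist[v], prev[v] = nd, e
                heapq.heappush(heap, (nd, counter, v))
                counter += 1
    if dst not in dist:
        raise ValueError("destination unreachable")
    path, node = [], dst
    while node != src:
        path.append(prev[node])
        node = prev[node][0]
    path.reverse()
    return path


def route_phase(links: Sequence[Link], windows, w: int, mu: float, delta: float):
    """Route one phase; windows is a list of lists of (src, dst) injections."""
    c = {e: delta for e in links}          # initialize c(e) = delta for all e
    routes = []
    for window in windows:                  # i = 1, ..., t
        for src, dst in window:
            p = least_congested_path(c, src, dst)
            for e in p:                     # c(e) <- c(e)(1 + mu / w) on the route
                c[e] *= 1.0 + mu / w
            routes.append(p)
    return routes, c
```

On a network with two disjoint two-hop routes from `s` to `t`, two consecutive packets are placed on different routes, because the first packet raises the congestion of the route it uses.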

Protocol. We route every injected packet along the path whose total congestion is the smallest
under the current congestion function c(·), i.e. we route along shortest paths with respect to c(·).
Initially, the congestion along every link is set to $\delta$, where $\delta$ is defined in (2). For every link e along
the chosen route, its congestion c(e) is updated to $c(e)(1 + \mu/w)$, where $\mu$ is defined in (1). We
reset the congestion of every link to its initial value of $\delta$ at the beginning of each phase. A phase
terminates in t windows of w steps, where t is an integer defined in (3). Figure 1 illustrates the
procedure for one phase. The values of $\mu$, $\delta$ and t are defined as follows. Let m be the number of
links in the network. For any $R \in (r, 1)$ of our choice, let

$$\mu = 1 - \left(\frac{r}{R}\right)^{1/3}, \qquad (1)$$

$$\delta = \left(\frac{1 - r\mu}{m}\right)^{1/(r\mu)}, \qquad (2)$$

$$t = \left\lfloor \frac{1 - r\mu}{r\mu} \ln \frac{1 - r\mu}{m\delta} \right\rfloor + 1. \qquad (3)$$

Our objective is to show,

Theorem 2 For all packets injected during one phase, at most twR of their routes chosen by our
procedure go through the same link. In other words these routes are weakly (tw, R)-admissible.

¹ In fact, as will be seen later, we only need to assume that the adversary can choose fractional paths that are
weakly (w, r)-admissible.
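
To get a feel for the magnitudes involved, the parameters can be evaluated numerically. The sketch below assumes the forms μ = 1 − (r/R)^(1/3), δ = ((1 − rμ)/m)^(1/(rμ)) and t = ⌊((1 − rμ)/(rμ)) ln((1 − rμ)/(mδ))⌋ + 1 for definitions (1)-(3). The function name is hypothetical, and ln δ is returned instead of δ because δ itself can underflow in floating point for large m or small rμ.

```python
import math


def phase_parameters(r: float, R: float, m: int):
    """Return (mu, ln(delta), t) for injection rate r, target rate R and m links."""
    if not (0.0 < r < R < 1.0) or m < 1:
        raise ValueError("need 0 < r < R < 1 and m >= 1")
    mu = 1.0 - (r / R) ** (1.0 / 3.0)
    x = r * mu
    log_base = math.log((1.0 - x) / m)   # ln((1 - r mu) / m)
    log_delta = log_base / x             # ln(delta), from the assumed form of (2)
    log_ratio = log_base - log_delta     # ln((1 - r mu) / (m delta))
    t = math.floor((1.0 - x) / x * log_ratio) + 1
    return mu, log_delta, t
```

With these values one can check numerically that m·δ/(1 − rμ)^t is at most 1 and that log_{1+μ}(1/δ) is at most tR, which are the two facts the analysis relies on.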

Analysis. To prove Theorem 2, let us examine an integer program formulation for routing the set
of packets injected during a window of w time steps. Let $P_j$ be the set of possible routes for the
jth packet, and let variable $x_j(p) \in \{0, 1\}$ indicate whether or not route $p \in P_j$ is chosen for packet
j. The following linear relaxation of the integer program (LP) has an optimal solution $\lambda \ge 1$ since
the injections are (w, r)-admissible. We present both the primal and the dual.

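Primal

$$\begin{aligned}
\max\ & \lambda \\
\text{s.t.}\ & \sum_{j} \sum_{p \in P_j :\, e \in p} x_j(p) \le rw && \forall e \\
& \sum_{p \in P_j} x_j(p) \ge \lambda && \forall j \\
& x_j(p) \ge 0 && \forall j,\ \forall p \in P_j
\end{aligned}$$

Dual

$$\begin{aligned}
\min\ & \sum_{e} rw \cdot c(e) \\
\text{s.t.}\ & \sum_{e \in p} c(e) \ge z(j) && \forall j,\ \forall p \in P_j \\
& \sum_{j} z(j) \ge 1 \\
& c(e) \ge 0 && \forall e \\
& z(j) \ge 0 && \forall j
\end{aligned}$$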

For any non-negative congestion function c(·), let $D = \sum_e c(e)$ be the total congestion of all links.
For packet j let $q_j$ be the least congested path in terms of c. We use $\alpha = \sum_j \sum_{e \in q_j} c(e)$ to represent
the total congestion of these least congested paths. It can be shown that the dual is equivalent to

$$\min_{c}\ rw \cdot D/\alpha.$$

The congestion found at the end of window i by our protocol (see Figure 1) defines a valid solution
to this reformulated dual for window i. We exploit this connection to prove Theorem 2. The key
here is to bound the total link congestion since the link congestion increases only when a path goes
through it. In particular, the following three lemmas show that the total link congestion is no more
than 1 at the end of a phase. Let $c_i(e)$, $D_i$ and $\alpha_i$ represent the values of c(e), D and $\alpha$ at the end
of the ith window.

Lemma 3 $D_i/\alpha_i \ge 1/(rw)$ for $1 \le i \le t$.

Proof: Since the injections are (w, r)-admissible, the primal LP for window i has $\max \lambda \ge 1$. Since
the congestion $c_i$ found by our protocol defines a dual solution, our lemma follows from duality. ■
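
Spelled out, the duality step in the proof of Lemma 3 reads as follows. The symbol $\lambda^{*}$ for the common optimum of the primal and the reformulated dual is introduced here only for illustration.

```latex
\[
  rw \cdot \frac{D_i}{\alpha_i}
  \;\ge\; \min_{c}\; rw \cdot \frac{D}{\alpha}
  \;=\; \lambda^{*}
  \;\ge\; 1
  \qquad\Longrightarrow\qquad
  \frac{D_i}{\alpha_i} \;\ge\; \frac{1}{rw}.
\]
```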

Lemma 4 $D_i \le \dfrac{D_{i-1}}{1 - r\mu}$.

Proof: It suffices to show $D_i \le D_{i-1} + \alpha_i \cdot \mu/w$, since $D_i/\alpha_i \ge 1/(rw)$ by Lemma 3. Let $c_{i,j}$ be
the congestion function after routing the jth packet injected during the ith window and let $D_{i,j}$
be defined in terms of $c_{i,j}$. Suppose path $p_j$ is chosen for the jth packet injected during the ith
window. By definition we have,

$$\begin{aligned}
D_{i,j} &= \sum_{e} c_{i,j}(e) \\
&= \sum_{e \notin p_j} c_{i,j-1}(e) + \sum_{e \in p_j} c_{i,j-1}(e)(1 + \mu/w) \\
&= D_{i,j-1} + \sum_{e \in p_j} c_{i,j-1}(e) \cdot \mu/w.
\end{aligned}$$

Now we repeatedly apply the recurrence above. We also observe that the congestion function c only
increases. Hence, if $q_j$ is the least congested path for j under $c_i$ then $\sum_{e \in p_j} c_{i,j-1}(e)$ is necessarily
no more than $\sum_{e \in q_j} c_i(e)$. (We emphasize that $p_j$ and $q_j$ may be two different paths. The path $p_j$
is least congested with respect to $c_{i,j-1}$ and $q_j$ is least congested with respect to $c_i$.) We have,

$$\begin{aligned}
D_i &= D_{i-1} + \sum_{j} \sum_{e \in p_j} c_{i,j-1}(e) \cdot \mu/w \\
&\le D_{i-1} + \alpha_i \cdot \mu/w.
\end{aligned}$$



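Lemma 5 $D_t \le 1$.

Proof: By definition $D_0 = m\delta$, where m is the number of links in the network. By applying
Lemma 4, we have,

$$\begin{aligned}
D_t &\le \frac{m\delta}{(1 - r\mu)^{t}} \\
&= \frac{m\delta}{1 - r\mu} \left(1 + \frac{r\mu}{1 - r\mu}\right)^{t-1} \\
&\le \frac{m\delta}{1 - r\mu}\, e^{\frac{r\mu (t-1)}{1 - r\mu}} \\
&\le 1.
\end{aligned}$$

The second inequality follows from $1 + x \le e^x$ for $x \ge 0$. The last inequality follows from the
definition of t in (3). ■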

We are now ready to prove Theorem 2.

Proof of Theorem 2: Consider any link e. For every w paths routed through e, the congestion of
e is increased by a factor of at least $1 + \mu$. Initially, $c_0(e) = \delta$. Since $D_t \le 1$, $c_t(e) \le 1$. Hence, the
total number of paths that are routed through e in a phase is at most $w \log_{1+\mu}(1/\delta)$. It suffices to
show that this quantity is no more than wtR.

$$\begin{aligned}
\frac{w \log_{1+\mu}(1/\delta)}{wtR}
&\le \frac{\ln(1/\delta)}{\ln(1 + \mu)} \cdot \frac{r\mu}{1 - r\mu} \cdot \frac{1}{\ln\frac{1 - r\mu}{m\delta}} \cdot \frac{1}{R} \\
&= \frac{r\mu}{R \ln(1 + \mu)(1 - r\mu)^2} \\
&\le \frac{r}{R(1 - \mu)^3} \\
&= 1.
\end{aligned}$$

The first inequality and the first equality follow from the definitions of t and $\delta$ respectively.
The second inequality follows from the fact that r < 1 and $\ln(1 + \mu) \ge \mu - \mu^2/2$. The last equality
follows from the definition of $\mu$. Our proof is complete. ■
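
The intermediate algebra behind the proof of Theorem 2 can be written out as below. This is a worked sketch under the assumption that (2) has the form $\delta = ((1 - r\mu)/m)^{1/(r\mu)}$ and that (1) gives $(1 - \mu)^3 = r/R$.

```latex
\begin{align*}
\ln\frac{1}{\delta} &= \frac{1}{r\mu}\,\ln\frac{m}{1 - r\mu}, \\
\ln\frac{1 - r\mu}{m\delta}
  &= \ln\frac{1}{\delta} - \ln\frac{m}{1 - r\mu}
   = \frac{1 - r\mu}{r\mu}\,\ln\frac{m}{1 - r\mu}, \\
\frac{\ln(1/\delta)}{\ln\frac{1 - r\mu}{m\delta}} &= \frac{1}{1 - r\mu}, \\
\ln(1 + \mu) &\ge \mu - \frac{\mu^2}{2}
   = \mu\Bigl(1 - \frac{\mu}{2}\Bigr) \ge \mu(1 - \mu),
   \qquad 1 - r\mu \ge 1 - \mu, \\
\frac{r\mu}{R\,\ln(1 + \mu)\,(1 - r\mu)^2}
  &\le \frac{r}{R\,(1 - \mu)^3} = 1.
\end{align*}
```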

Find routes.

  Initialize c(e) = δ, ∀e
  for the ith window, i = 1, ..., t
      for each packet injected during the ith window
          p ← least congested route under c
      c(e) ← c(e)(1 + N_i(e)·μ/w)

Figure 2: Procedure to find routes for packets injected during one phase with fewer updates.

2.2 Routing with Less Frequent Updates 

In this section we show that Theorem 2 still holds even if the congestion function c is updated less
frequently. In particular, we only update the congestion at the end of each window, not for each
packet injection. Hence the source nodes only need to communicate with the links at the end of
each window. For this new protocol we redefine $\mu$ to be

$$\mu = \frac{1}{m}\left(1 - \left(\frac{r}{R}\right)^{1/3}\right). \qquad (4)$$

Suppose $N_i(e)$ packets are routed through link e during the ith window; then we update c(e) to
$c(e)(1 + N_i(e) \cdot \mu/w)$. See Figure 2.
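
The batched update can be sketched in Python as follows. The function name and the `find_path` callback (any routine that returns a least congested route for a given congestion map) are assumptions of this example; the point illustrated is only that the congestion map stays frozen while a window is being routed and is updated once, using the per-link counts, when the window ends.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Link = Hashable
PathFinder = Callable[[Dict[Link, float], Hashable, Hashable], Sequence[Link]]


def route_phase_batched(
    links: Sequence[Link],
    windows: Sequence[Sequence[Tuple[Hashable, Hashable]]],
    w: int,
    mu: float,
    delta: float,
    find_path: PathFinder,
):
    """Route one phase, updating congestion only at the end of each window."""
    c = {e: delta for e in links}
    routes: List[List[Link]] = []
    for window in windows:
        n = Counter()                      # N_i(e): packets routed through e
        for src, dst in window:
            p = list(find_path(c, src, dst))   # c does not change within a window
            routes.append(p)
            n.update(p)
        for e, count in n.items():         # c(e) <- c(e)(1 + N_i(e) * mu / w)
            c[e] *= 1.0 + count * mu / w
    return routes, c
```

Because the congestion is frozen within a window, all packets of one window with the same source and destination receive the same route; the load is rebalanced only in the following window.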

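We prove that Theorem 2 remains true. We first show that Lemma 4 still holds. As before, we
show $D_i \le D_{i-1} + \alpha_i \cdot \mu/w$. For any packet j injected during the ith window, let $p_j$ be the path
chosen for j.

$$\begin{aligned}
D_i &= \sum_{e} c_i(e) \\
&= \sum_{e} c_{i-1}(e)(1 + N_i(e) \cdot \mu/w) \\
&= D_{i-1} + \sum_{e} c_{i-1}(e) N_i(e) \cdot \mu/w \\
&\le D_{i-1} + \alpha_i \cdot \mu/w.
\end{aligned}$$

Hence $D_t \le 1$. Now, for every mw paths routed through e, the congestion on e is increased by a
factor of at least $1 + m\mu$. Therefore the number of paths routed through any link in a phase is at most
$mw \log_{1+m\mu}(1/\delta)$, and

$$\begin{aligned}
\frac{mw \log_{1+m\mu}(1/\delta)}{wtR}
&\le \frac{m \ln(1/\delta)}{\ln(1 + m\mu)} \cdot \frac{r\mu}{1 - r\mu} \cdot \frac{1}{\ln\frac{1 - r\mu}{m\delta}} \cdot \frac{1}{R} \\
&= \frac{m r\mu}{R \ln(1 + m\mu)(1 - r\mu)^2} \\
&\le \frac{r}{R(1 - m\mu)^3} \\
&= 1,
\end{aligned}$$

with the revised definition of $\mu$ in (4).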

2.3 Implementation Using In-band Signaling

In the previous sections we assumed that sources can communicate with the links on their chosen
routes via instantaneous setup messages. In turn, we also assumed that the links can instantaneously
broadcast their congestion to the sources. In this section, we first extend our result in
Section 2.2 to the case where each of these communications takes $\tau$ time steps. We then give
an upper bound on $\tau$ for which the communication may be carried out in-band using packets
transmitted through the network.

Assume without loss of generality that $w \ge 2\tau$ (since admissibility for a small window implies
admissibility for a large window). Each source only updates the link congestion at the end of
every window. Since the congestion does not change during a window, all the packets for a given
source-destination pair (s, t) are routed along the same path p. At the end of window [w(i − 1), wi)
a control packet is sent along path p that contains the number of (s, t)-packets injected during
window [w(i − 1), wi). This packet takes time $\tau$ to traverse the path. Hence, at time $wi + \tau$, each
link can update its congestion due to all the packets injected during [w(i − 1), wi). Then by time
$wi + 2\tau \le w(i + 1)$ this new congestion can be distributed via control packets to all the sources.

Note that at the end of window [wi, w(i + 1)), every link has updated its congestion according
to the injections in window [w(i − 1), wi). The exact form of this update is as follows. Let $N_i(e)$
be the number of packets routed through e that were injected during [w(i − 1), wi). Let $c_i(e)$ be
the congestion of e at the end of window [w(i − 1), wi). We update $c_i(e)$ by,

$$c_{i+1}(e) = c_i(e) + c_{i-1}(e) N_i(e) \cdot \mu/w,$$

for

$$\mu = \frac{1}{2m}\left(1 - \left(\frac{r}{R}\right)^{1/3}\right). \qquad (5)$$

To show that Theorem 2 remains true, we observe,

$$\begin{aligned}
D_{i+1} &= \sum_{e} c_{i+1}(e) \\
&= \sum_{e} \left( c_i(e) + c_{i-1}(e) N_i(e) \cdot \mu/w \right) \\
&= D_i + \sum_{e} c_{i-1}(e) N_i(e) \cdot \mu/w \\
&\le D_i + \alpha_{i,i+1} \cdot \mu/w.
\end{aligned}$$

Here $\alpha_{i,i+1}$ is the sum of the congestion along the paths chosen for packets injected during
[w(i − 1), wi) with respect to $c_{i+1}(e)$. This is sufficient to imply $D_t \le 1$. Note also that for every 2mw
(non-control) packets routed through a link, the congestion function of the link increases by at
least a factor of $1 + 2m\mu$. The remainder of the analysis follows through for the revised definition of
$\mu$ in (5).

To ensure that the transmission time of the control packets is upper bounded, the scheduling
protocol always gives priority to control packets. Observe that a total of at most $n^2 + mn$ control
packets can be sent out during one window, where m is the number of links and n is the number of
nodes in the network. If we let $\tau = n^3 + mn^2$, the transmission of a control packet takes at most $\tau$
time steps. Without loss of generality we assume that $w \ge 2\tau$ and $w(1 - r)/2 \ge n^2 + mn$. The latter
condition ensures that together with the control packets the injections are (w, (1 + r)/2)-admissible.
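
The arithmetic behind these conditions can be checked with a short Python sketch. The function name and return format are assumptions of this example. It computes the bound n² + mn on control packets per window, the delay bound τ = n³ + mn², and the smallest integer window size meeting both w ≥ 2τ and w(1 − r)/2 ≥ n² + mn.

```python
import math


def control_parameters(n: int, m: int, r: float):
    """Return (control packets per window, tau, smallest admissible window w)."""
    per_window = n * n + m * n           # bound on control packets per window
    tau = n ** 3 + m * n ** 2            # tau = n * (n^2 + mn)
    # Smallest integer w with w >= 2*tau and w*(1 - r)/2 >= per_window.
    w_min = max(2 * tau, math.ceil(2 * per_window / (1.0 - r)))
    return per_window, tau, w_min
```

For any window size w at least `w_min`, data and control traffic together load a link with at most rw + (n² + mn) ≤ w(1 + r)/2 packets per window, which is the (w, (1 + r)/2)-admissibility stated above.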

3 A Scheduling Protocol with Polynomial Delay Bounds

In this section we assume that (w, r)-admissible paths are known (either given by the adversary
or computed as in Section 2). Hence, in order to achieve network stability we can use any of
the scheduling protocols that are known to be stable for Adversarial Queueing. However, the 
best previous delay bounds known for distributed, deterministic protocols are exponential in the 
maximum packet path length. In this section we present a deterministic, distributed scheduling 
protocol with a polynomial delay bound. 

In |] a randomized protocol was presented for which the delay bound is 0{ dm ™ log m) with 
high probability, where $\varepsilon = 1 - r$ and $d_{\max}$ is the length of the longest simple path in the network.
This protocol is hard to derandomize because its success depends on a condition that can only be 
checked globally. In this section we first present a new randomized protocol and then show how to 
derandomize it in a distributed manner. The key idea of this protocol is that the conditions that 
determine the "success" of the protocol only depend on packets that share the same initial link. 
This allows derandomization in a distributed manner. 

Our new randomized protocol is defined in terms of two parameters M and T which are defined 
below. We partition time into intervals of length M, which we call M -intervals. We save up all 
packets that are injected into the network during each M-interval and then schedule these packets 
during the next M-interval. We give each packet a deadline for every link on its path. Our goal is 
to make sure that no more than T packets have a deadline for link e during any time interval of 
length T. If this condition holds then we are able to bound the end-to-end delay experienced by a 
packet. 



Randomized protocol. For a packet p injected during an M-interval $[(\gamma - 1)M, \gamma M)$ for an
integral $\gamma$, let us suppose its path is $e_0, e_1, \ldots, e_\ell$. We define a deadline $\tau_k^p$ for p at link $e_k$ as
follows. We choose the initial deadline $\tau_0^p$ uniformly at random from $[\gamma M + T, (\gamma + 1)M - d_{\max} T)$.
We then define the remaining deadlines inductively by $\tau_{k+1}^p = \tau_k^p + T$. Our protocol always gives
priority to the packet with the smallest deadline at each link. We define M and T such that,

T = ^l g(2Mm 2 ), (6) 

M > max 1 1 ~ 2 (d max + 1)T, w j . (7) 

These properties are satisfied for, 

M = o{-^—\og- + w\ . 
When a packet meets its deadlines, it reaches its destination within 2M steps. 
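
The deadline assignment can be sketched in Python as follows. The function names, the use of integer time steps and the representation of a queued packet as a pair (deadline, packet id) are assumptions of this example. M, T and d_max are supplied by the caller, since their values are fixed by (6) and (7).

```python
import random
from typing import List, Sequence, Tuple


def assign_deadlines(
    path_len: int, gamma: int, M: int, T: int, d_max: int, rng: random.Random
) -> List[int]:
    """Deadlines for a packet injected during the M-interval [(gamma-1)M, gamma*M).

    path_len is the number of links on the packet's path (at most d_max).
    """
    lo = gamma * M + T
    hi = (gamma + 1) * M - d_max * T
    if hi <= lo:
        raise ValueError("M is too small relative to (d_max + 1) * T")
    first = rng.randrange(lo, hi)        # uniform over integer steps in [lo, hi)
    # Remaining deadlines are defined inductively: each is T after the previous.
    return [first + k * T for k in range(path_len)]


def next_packet(queue: Sequence[Tuple[int, str]]) -> Tuple[int, str]:
    """Pick the queued (deadline, packet_id) pair with the smallest deadline."""
    return min(queue)
```

Since the initial deadline is below (γ + 1)M − d_max·T and consecutive deadlines are T apart, every deadline of the packet falls inside the M-interval following the one in which it was injected.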



Analysis. Our objective is to show that all packets injected during a given M-interval meet all
their deadlines with a constant probability. Lemma 6 gives a sufficient condition for all deadlines
to be met. For any packet p and link e let $X^p_{e,[t,t+T)} = 1$ if e is the kth link on packet p's path and
$\tau_k^p$ lies in the time interval [t, t + T). Let $X^p_{e,[t,t+T)} = 0$ otherwise.

Lemma 6 If $\sum_p X^p_{e,[t,t+T)} \le T$ for all t and all links e, then all packets meet all their deadlines.

Proof: Suppose not. Let p be a packet that misses its kth deadline $\tau_k^p$ and suppose that no
deadline earlier than $\tau_k^p$ is missed. Then p has arrived at its kth link $e_k$ by time $\tau_k^p - T$. (This is
true regardless of whether $e_k$ is the initial link of p or not.) By our assumption that $\tau_k^p$ is the first
deadline that is missed, all the packets with deadlines for $e_k$ that are earlier than $\tau_k^p - T + 1$ meet
those deadlines. Therefore, the only packets that block packet p in the interval $[\tau_k^p - T + 1, \tau_k^p]$ have
deadlines in the interval $[\tau_k^p - T + 1, \tau_k^p]$. By the assumption in the statement of the lemma there
are at most T − 1 such packets (excluding p). Therefore packet p is served by link $e_k$ at time $\tau_k^p$ or
earlier. This is a contradiction. ■
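
The hypothesis of Lemma 6 is easy to test for a concrete set of deadlines. The Python sketch below is an illustration only: the function name and the input format (a map from each link to the integer deadlines assigned on it) are assumptions. It uses a sliding window over the sorted deadlines of each link.

```python
from typing import Dict, Hashable, Sequence


def deadline_condition_holds(
    deadlines_by_link: Dict[Hashable, Sequence[int]], T: int
) -> bool:
    """True if every window [t, t + T) contains at most T deadlines on every link."""
    for times in deadlines_by_link.values():
        ts = sorted(times)
        left = 0
        for right, latest in enumerate(ts):
            # Keep only deadlines in (latest - T, latest], i.e. a window of length T.
            while ts[left] <= latest - T:
                left += 1
            if right - left + 1 > T:
                return False
    return True
```

It suffices to examine windows that end at a deadline, because any window of length T holding the largest number of deadlines can be shifted so that its last time step is one of them.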

Given Lemma 6 we show,

Lemma 7 Consider packets injected during an M-interval $[(\gamma - 1)M, \gamma M)$. The number of deadlines
from these packets on any link e during any interval [t, t + T) is at most T with a constant
probability.

Proof: We use a Chernoff bound to prove the number of deadlines is small. Let $S^\gamma_{e_0,e}$ be the set of
packets injected into the network during the interval $[(\gamma - 1)M, \gamma M)$ that have $e_0$ as their initial
link and that have link e on their path. The expected number of deadlines is,

$$E\Bigl[\sum_{p \in S^\gamma_{e_0,e}} X^p_{e,[t,t+T)}\Bigr] \le \frac{|S^\gamma_{e_0,e}|}{M - (d_{\max} + 1)T}\, T.$$

When $|S^\gamma_{e_0,e}|$ is large, the expectation is large and the argument is straightforward. However, for
small $|S^\gamma_{e_0,e}|$ a direct application of the Chernoff bound may not suffice. To rectify this, let us define
a new quantity,

The quantity has the following properties. 

1. (3l je > e/3m; 

2- E en /3J n , e < 



M- 



w ((l - e) + me/3m) < ^^(1 - 2e/3) < 1 - e/2. 



The second property follows from the requirement of M in (7) and the admissibility of the paths.
Our lemma follows if we show that the following holds with constant probability,

$$\sum_{p \in S^\gamma_{e_0,e}} X^p_{e,[t,t+T)} \le (1 + \varepsilon/2)\,\beta^\gamma_{e_0,e}\,T, \qquad \forall e_0, e \text{ and } \forall [t, t + T). \qquad (8)$$

If the above holds, the number of deadlines on link e in the interval [t, t + T) is at most
$(1 + \varepsilon/2) \sum_{e_0} \beta^\gamma_{e_0,e} T$, which is less than T due to the second property of $\beta$. We have,



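$$\begin{aligned}
\Pr\Bigl[\sum_{p \in S^\gamma_{e_0,e}} X^p_{e,[t,t+T)} > (1 + \varepsilon/2)\,\beta^\gamma_{e_0,e}\,T\Bigr]
&\le \frac{\prod_{p \in S^\gamma_{e_0,e}} E\bigl[(1 + \varepsilon/2)^{X^p_{e,[t,t+T)}}\bigr]}{(1 + \varepsilon/2)^{(1 + \varepsilon/2)\beta^\gamma_{e_0,e} T}} \\
&\le \exp(-\varepsilon^2 \beta^\gamma_{e_0,e} T/12) \\
&\le \frac{1}{2Mm^2}.
\end{aligned} \qquad (9)$$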

The first inequality is due to a Chernoff bound. The second inequality holds since
$E[\sum_{p \in S^\gamma_{e_0,e}} X^p_{e,[t,t+T)}] \le \beta^\gamma_{e_0,e} T$ and $1 + x \le e^x$ for $x \ge 0$. The third inequality follows from the definition of T in (6)
and the fact that $\beta^\gamma_{e_0,e} \ge \varepsilon/3m$. By taking a union bound over all links $e_0$, e and all intervals
$[t, t + T) \subseteq [\gamma M, (\gamma + 1)M)$, we have that the number of deadlines from all packets on e during
[t, t + T) is at most T with probability at least 1/2. ■

Remarks. To prove Lemma 7 a condition weaker than (8) would be sufficient. It would suffice
to show that the number of deadlines on any e during any [t, t + T) is at most $(1 + \varepsilon/2) \sum_{e_0} \beta^\gamma_{e_0,e} T$.
Indeed, this would even allow T and M to be a factor of m smaller, as in [?]. However, such a
weaker condition only allows derandomization in a centralized manner.

We emphasize that the condition (8) depends only on sets of packets that are injected into one
particular initial link. Therefore we can choose the deadlines for a packet simply by considering the
other packets that are injected at the same initial link. Hence, we can carry out a derandomization
independently at each initial link and obtain a distributed, deterministic protocol. This is in contrast
to the randomized protocol of [?] in which the success condition depends on packets that are injected
across all initial links in the network.



Derandomization. We use the method of conditional expectations to derandomize the protocol
for each M-interval. (See e.g. [?].) In summary,

Theorem 8 Our derandomized protocol is distributed and guarantees a delay bound of
$2M = \mathrm{poly}(m, w, 1/\varepsilon)$ for every packet.



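Proof: Let $S^{\gamma}_{e_0} = \{p_0, p_1, \ldots, p_\ell\}$. For $i \le \ell$, let $g(\delta_0, \delta_1, \ldots, \delta_i)$ be equal to

$$\sum_{e,t} \Pr\Big[X^{e_0,e}_{[t,t+T)} > (1+\varepsilon/2)\,\beta^{\gamma}_{e_0,e}T \;\Big|\; \tau_0^{p_0} = \delta_0, \ldots, \tau_0^{p_i} = \delta_i\Big],$$

where t is summed over the range $[\gamma M, (\gamma+1)M - T)$. By a calculation similar to the Chernoff
calculation of (9), the value of $g(\cdot, \ldots, \cdot)$ is upper bounded by the following function h,

$$h(\delta_0, \delta_1, \ldots, \delta_i) = \sum_{e,t} \frac{\prod_{p\in S^{\gamma}_{e_0}} E\big[(1+\varepsilon/2)^{X^{p,e}_{[t,t+T)}} \;\big|\; \tau_0^{p_0} = \delta_0, \ldots, \tau_0^{p_i} = \delta_i\big]}{(1+\varepsilon/2)^{(1+\varepsilon/2)\beta^{\gamma}_{e_0,e}T}}.$$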

For fixed $\delta_0, \ldots, \delta_{i-1}$, the definition of conditional expectation implies that there exists an initial
deadline $\delta_i$ for the packet $p_i$ such that $h(\delta_0, \delta_1, \ldots, \delta_{i-1}) \ge h(\delta_0, \delta_1, \ldots, \delta_{i-1}, \delta_i)$. If we always choose
the initial deadline so that this inequality is satisfied then,

$$g(\delta_0, \delta_1, \ldots, \delta_\ell) \le h(\delta_0, \delta_1, \ldots, \delta_\ell) \le h(\emptyset) \le \sum_{e,t}\exp(-\varepsilon^2\beta^{\gamma}_{e_0,e}T/12).$$

The third inequality follows from (9). We have chosen the parameters M and T so that $\sum_{e,t}\exp(-\varepsilon^2\beta^{\gamma}_{e_0,e}T/12)$
is less than 1. In addition, since $g(\delta_0, \delta_1, \ldots, \delta_\ell)$ involves no randomness, every term of g is either
0 or 1. The above inequalities imply that $g(\delta_0, \delta_1, \ldots, \delta_\ell)$ is less than 1, and hence equal to 0, so condition (8) fails
with probability zero. Hence, with probability one all deadlines are met and all packets reach their
destinations in time 2M.

Figure 3: Network G for which FIFO and NTG are unstable even if we are allowed to choose routes. 

It remains to show that we can calculate $h(\delta_0, \ldots, \delta_i)$. If $j \le i$ then,

$$E\big[X^{p_j,e}_{[t,t+T)} \;\big|\; \tau_0^{p_0} = \delta_0, \ldots, \tau_0^{p_i} = \delta_i\big]$$

is equal to 0 or 1 depending on whether or not the initial deadline $\delta_j$ causes packet $p_j$ to have a
deadline for link e during [t, t + T). If $j > i$ then,

$$E\big[X^{p_j,e}_{[t,t+T)} \;\big|\; \tau_0^{p_0} = \delta_0, \ldots, \tau_0^{p_i} = \delta_i\big] = E\big[X^{p_j,e}_{[t,t+T)}\big],$$

which is equal to the probability, over all possible choices of the initial deadline, that packet $p_j$
has a deadline for link e during the interval [t, t + T). (Recall that the initial deadline has at
most M choices and all subsequent deadlines are chosen deterministically.) This probability is
solely dependent on whether or not the path for packet $p_j$ passes through link e. Hence, for fixed
$\delta_0, \ldots, \delta_{i-1}$ we can choose the value of $\delta_i$ that minimizes $h(\delta_0, \delta_1, \ldots, \delta_{i-1}, \delta_i)$. ■

4 Instability in Combined Routing and Scheduling 

In [?] it was shown that if the packet routes are given by the adversary then the FIFO and Nearest-to-Go
(NTG) scheduling protocols can be unstable even if the packet paths are admissible. (FIFO
always gives priority to the packet that arrived at the link earliest. NTG always gives priority
to the packet that has the smallest number of hops remaining to its destination.) However, the
examples given in [?] do not lead to instability if we are allowed to route packets on paths other
than the ones chosen by the adversary.

We therefore have a natural question. If we are allowed to choose the routes, can we guarantee 
that FIFO and NTG are stable? In this section we show that the answer to this question is 
negative. We present examples in which regardless of how we choose the routes, the FIFO and 
NTG scheduling protocols create instability. 

Theorem 9 There exists a network G such that FIFO creates instability under some
(w, r)-admissible injections regardless of how packets are routed.

Proof: Network G is shown in Figure 3. We break the packet injections into phases. We inductively
assume that at the beginning of phase j a set S of s packets with destination $u_0$ is in the queue of
$e_0$. We show that at the beginning of phase j + 1 more than s packets with destination $u_1$ are in
the queue of $e_1$. By symmetry this process repeats indefinitely and the number of packets in the
network grows without bound. For the basis of the induction, we inject a large burst of packets at
source node $v_0$ with destination node $u_0$, which is allowed by a large window w. From now on all
the injections are at rate r with burst size one. In general the sequence of injections in phase j is
as follows.

(1) For the first s steps, we inject a set X of rs packets at node $v_0$ with destination $u_1$. These
packets are completely held up at $e_0$ by the packets in S. We also hold up packets in S at
$f_0$ by injecting rs packets at $w_0$ with destination $u_0$. These newly injected packets get mixed
with those of S into the set S'. At the end of the first s steps, rs packets from S' are at $f_0$.
Note that packets in X will be routed through either $f_0$ or $f_0'$.

(2) For the next rs steps, we inject a set Y of $r^2 s$ packets at node $v_0$ with destination $u_1$. These
packets are held up at $e_0$ by the packets in X. We also inject packets at $w_0$ with destination
$u_0'$ at rate r. These packets delay the packets from X that are routed through $f_0'$. Hence, at
most $rs/(r+1)$ packets of X cross $f_0'$. (This only happens if packets in X are routed through
$f_0'$, which is not necessarily the case.) Note that no packet from X crosses $f_0$ in these steps,
since the packets in S' have priority. Hence, at the end of these rs steps, a set $X' \subseteq X$ of at
least $r^2 s/(r+1)$ packets are still at $w_0$.

(3) For the next $|X'| + |Y|$ steps the packets in X' and Y move forward, and merge at $v_1$.
Meanwhile, we inject packets at $v_1$ with destination $u_1$ at rate r. We end with at least
$r(|X'| + |Y|)$ packets at $v_1$ with destination $u_1$. This number is at least $r^3 s + r^3 s/(r+1)$.

This ends phase j. For $r > 0.9$ we have $r^3 + r^3/(r+1) > 1$. It is easy to verify that the injections
during phase j are admissible. The inductive step is complete. ■
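
The growth condition at the end of the proof can be evaluated directly. The Python sketch below is illustrative only and its function names are hypothetical. It computes the per-phase lower bound $r^3 + r^3/(r+1)$ on the ratio between the queue sizes at the start of consecutive phases, and uses bisection to locate the rate at which this bound equals 1.

```python
def fifo_growth_factor(r: float) -> float:
    """Lower bound on (queue size at start of phase j+1) / (queue size s at start of phase j)."""
    x_prime = r * r / (r + 1.0)  # |X'| / s: packets of X still waiting after subphase (2)
    y = r * r                    # |Y| / s: packets injected during subphase (2)
    return r * (x_prime + y)     # subphase (3) leaves at least r(|X'| + |Y|) packets


def critical_rate(lo: float = 0.5, hi: float = 1.0, iters: int = 60) -> float:
    """Bisection for the rate where the growth factor equals 1.

    The factor is increasing in r, is below 1 at r = 0.5 and above 1 at r = 1.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if fifo_growth_factor(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

The root lies between 0.85 and 0.9, so any rate $r > 0.9$ makes the queue grow by a constant factor in every phase under this bound.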

Injections similar to the above can be used to prove the instability of NTG on network G at
any rate $r > 1/\sqrt{2}$. The induction hypothesis of phase j now does not require the packets in S to
be initially in the queue of $e_0$, but to cross $e_0$ in the first s steps of the phase. Hence, subphase
(3) is no longer required. Furthermore, after subphase (2) both sets Y and X' contain at least $r^2 s$
packets, since single-link injections have higher priority than the packets in X. It follows that the
system is unstable since $2r^2 s > s$.

5 Stability of a Ring with Parallel Links 

In this section we consider source routing on a ring with c parallel links. Consider a decomposition of
the network into c disjoint single rings. We propose a deterministic on-line source-routing algorithm
that routes each packet along one of these rings and guarantees that the routing is admissible. In
[?] it was shown that the single ring is stable under any greedy scheduling policy (i.e. one that
always schedules a packet whenever packets are waiting). Hence, we conclude that the ring with c
parallel links is stable under any greedy scheduling policy if our source-routing algorithm is used.

Note that the 4-ring with 2 parallel links was shown to be unstable under a greedy protocol
such as FIFO when the packet paths are given by the adversary [?]. This shows that freedom of
routing can make a difference in network stability, since we have a network that is unstable under
FIFO if the adversary can dictate the routes but is stable under FIFO if we can choose the routes
intelligently.
5.1 Definitions



Consider a ring with n nodes and c parallel directed links from node i to node i + 1 (mod n). The
parallel links connecting neighboring nodes are uniquely labeled 1, . . . , c. We denote the cycle of
n links labeled j as the jth single ring. Note that, if $j \ne j'$, the jth and the j'th single rings are
link disjoint. We assume that the injections are (w, r)-admissible. For convenience we sometimes
denote $1 - r$ by $\varepsilon$. We propose a source-routing algorithm that finds weakly (W, R)-admissible
paths along these single rings, where,

$$W = \frac{3}{r\varepsilon^2}\ln\frac{nc}{\beta}, \tag{10}$$

$$R = (1+\varepsilon)\,r = 1 - \varepsilon^2, \tag{11}$$

for some $\beta < 1$.
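
For concreteness, the parameters can be computed as in the following Python sketch. It assumes $W = \frac{3}{r\varepsilon^2}\ln\frac{nc}{\beta}$ and $R = (1+\varepsilon)r$, which are the values that make the last steps of the Chernoff calculation in Section 5.2 hold. The function name `ring_parameters` is hypothetical, and W is returned as a real number without rounding.

```python
import math
from typing import Tuple


def ring_parameters(n: int, c: int, r: float, beta: float) -> Tuple[float, float]:
    """Return (W, R) for a ring with n nodes, c parallel links, rate r and 0 < beta < 1.

    Assumed formulas (see the lead-in): W = 3 / (r * eps^2) * ln(n * c / beta)
    and R = (1 + eps) * r, where eps = 1 - r.
    """
    if not (0.0 < r < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("need 0 < r < 1 and 0 < beta < 1")
    eps = 1.0 - r
    W = 3.0 / (r * eps * eps) * math.log(n * c / beta)
    R = (1.0 + eps) * r
    return W, R
```

For example, with n = 4, c = 2, r = 0.5 and $\beta$ = 0.5 this gives R = 0.75 and W between 66 and 67.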



5.2 Randomized Algorithm 

Let us first study the following randomized routing algorithm. Each time a packet is injected, one
of the c single rings is chosen uniformly and independently at random, and the packet is routed along
it. Since the injections are (w, r)-admissible, in any W-interval at most crW packets are injected
that must cross the parallel links from any node i to i + 1 (mod n). Hence, the expected number
of packets routed along any link of the ring is at most rW. Using a Chernoff bound we can upper
bound the probability of more than $(1+\varepsilon)rW = RW$ packets being routed along any link in the
W-interval. Let $P = \{p_0, p_1, \ldots, p_\ell\}$ be the set of packets injected in a W-interval. For each packet
$p_j$, let $X_{e,j}$ be the random variable denoting whether $p_j$ is routed along link e. Let $X_e$ be the
number of packets routed along link e in the W-interval. From a Chernoff bound we have that,



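$$\begin{aligned}
\Pr[X_e > (1+\varepsilon)rW] &\le \frac{\prod_{p_j\in P} E\big[(1+\varepsilon)^{X_{e,j}}\big]}{(1+\varepsilon)^{(1+\varepsilon)rW}} \\
&\le \left[\frac{e^{\varepsilon}}{(1+\varepsilon)^{(1+\varepsilon)}}\right]^{rW} \\
&= e^{(\varepsilon-(1+\varepsilon)\ln(1+\varepsilon))rW} \\
&\le e^{-\varepsilon^2 rW/3} \\
&\le \frac{\beta}{nc}.
\end{aligned}$$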



The last two inequalities follow from the fact that $\varepsilon < 1$ and the definition of W in (10), respectively.
We can now bound the probability of any link having more than $(1-\varepsilon^2)W$ packets routed along
it. We use E to denote the set of links in the ring.

$$\Pr[\max_{e\in E} X_e > (1+\varepsilon)rW] \le \sum_{e\in E}\Pr[X_e > (1+\varepsilon)rW] \le nc\cdot\frac{\beta}{nc} = \beta.$$



Hence, since $\beta < 1$, there is a positive probability of routing all the packets in such a way that
no link has congestion more than RW. By choosing a very small $\beta$ (e.g., O(1/n)) we could
show that this randomized algorithm guarantees that the routing is weakly (W, R)-admissible with
high probability. This can be used to show the stability of any greedy scheduling protocol in a
probabilistic sense (i.e., there is a value C such that the probability of having more than kC packets
in the system at any given time is exponentially small in k).
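
The behaviour of the randomized algorithm can be explored with a small Monte Carlo experiment. The Python sketch below is illustrative: the packet encoding (source node, number of hops), the function names, and the fixed seed are assumptions, not part of the paper. It routes each packet on a uniformly random single ring and reports how often some link carries more packets than a given threshold such as $(1+\varepsilon)rW$.

```python
import random
from typing import Sequence, Tuple

Packet = Tuple[int, int]  # (source node, number of hops); an assumed encoding


def max_link_load(packets: Sequence[Packet], n: int, c: int, rng: random.Random) -> int:
    """Route every packet on a uniformly random ring and return the largest link load."""
    load = [[0] * c for _ in range(n)]
    for src, hops in packets:
        ring = rng.randrange(c)          # independent uniform choice among the c rings
        for k in range(hops):
            load[(src + k) % n][ring] += 1
    return max(max(row) for row in load)


def overload_frequency(packets: Sequence[Packet], n: int, c: int,
                       threshold: float, trials: int, seed: int = 0) -> float:
    """Fraction of trials in which some link carries more than `threshold` packets."""
    rng = random.Random(seed)
    bad = sum(max_link_load(packets, n, c, rng) > threshold for _ in range(trials))
    return bad / trials
```

With crW packets crossing every position, the analysis above says the returned frequency should be at most about $\beta$ when the threshold is $(1+\varepsilon)rW$. In practice it is usually far smaller, since the Chernoff bound is not tight.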

However, in the rest of the section we only need $\beta < 1$. We will derandomize the proposed
algorithm, and all we need for this process to work is to have a feasible routing with the required
properties. This is guaranteed for any $\beta < 1$.



5.3 Off-line Routing 

We will now derandomize the above algorithm so that all the packets are deterministically routed
and no link has congestion more than $(1-\varepsilon^2)W$. To do this, we use the method of conditional
probabilities, as we did in Section 3. Unfortunately, to apply this method directly we need to know
from the beginning the set P of packets to be routed. We achieve this as follows. We divide time
into intervals of W steps, and hold all the packets injected in one W-interval until its last step.
Then, all these packets are routed in that last step, when all of them are known.

Let $P = \{p_0, p_1, \ldots, p_\ell\}$ be the set of packets injected in a W-interval. Let $\gamma_{p_j}$ denote the single
ring chosen to route packet $p_j$. For $i \le \ell$ let,

$$g(\delta_0, \delta_1, \ldots, \delta_i) = \Pr[\max_{e} X_e > (1+\varepsilon)rW \mid \gamma_{p_0} = \delta_0, \ldots, \gamma_{p_i} = \delta_i].$$

Since $g(\cdot, \ldots, \cdot)$ is difficult to calculate directly, we define another function $h(\cdot, \ldots, \cdot)$ by,

$$h(\delta_0, \delta_1, \ldots, \delta_i) = \sum_{e\in E}\frac{\prod_{p_j\in P} E\big[(1+\varepsilon)^{X_{e,j}} \mid \gamma_{p_0} = \delta_0, \ldots, \gamma_{p_i} = \delta_i\big]}{(1+\varepsilon)^{(1+\varepsilon)rW}},$$



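which can be easily computed. For this, it is enough to observe that, when computing $h(\delta_0, \delta_1, \ldots, \delta_i)$,
for each packet $p_j$,

• if $j \le i$, then

- if e is in the $\delta_j$th single ring and it is in the path from the source to the destination of
$p_j$, then $E[(1+\varepsilon)^{X_{e,j}} \mid \gamma_{p_0} = \delta_0, \ldots, \gamma_{p_i} = \delta_i] = 1 + \varepsilon$.

- Otherwise, $E[(1+\varepsilon)^{X_{e,j}} \mid \gamma_{p_0} = \delta_0, \ldots, \gamma_{p_i} = \delta_i] = (1+\varepsilon)^0 = 1$.

• if $j > i$, then

- if e could be in the path from the source to the destination of $p_j$, then, since $p_j$ is routed along
the single ring containing e with probability 1/c, $E[(1+\varepsilon)^{X_{e,j}} \mid \gamma_{p_0} = \delta_0, \ldots, \gamma_{p_i} = \delta_i] = \frac{1}{c}(1+\varepsilon) + \big(1 - \frac{1}{c}\big) = 1 + \varepsilon/c$.

- Otherwise, $E[(1+\varepsilon)^{X_{e,j}} \mid \gamma_{p_0} = \delta_0, \ldots, \gamma_{p_i} = \delta_i] = (1+\varepsilon)^0 = 1$.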

We have that $g(\delta_0, \delta_1, \ldots, \delta_i) \le h(\delta_0, \delta_1, \ldots, \delta_i)$. Also, for fixed $\delta_0, \ldots, \delta_{i-1}$, the definition of
conditional expectation implies that the single ring $\delta_i$ can be chosen such that $h(\delta_0, \delta_1, \ldots, \delta_{i-1}) \ge
h(\delta_0, \delta_1, \ldots, \delta_{i-1}, \delta_i)$. If we always choose the single rings so that this inequality is satisfied then,

$$g(\delta_0, \delta_1, \ldots, \delta_\ell) \le h(\delta_0, \delta_1, \ldots, \delta_\ell) \le h(\emptyset) \le \beta.$$

In this expression, the left-hand side involves no randomness and so it is either 0 or 1. However,
since $\beta < 1$, it has to be less than 1 and so there must be a probability zero of failure. Hence, no
link has congestion more than $(1-\varepsilon^2)W$, and the routing is weakly (W, R)-admissible.
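
The derandomization can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. Packets are encoded as (source node, number of hops) with at most n hops. A packet whose ring is not yet fixed contributes the factor $1 + \varepsilon/c$ to every link it might cross, and the common denominator $(1+\varepsilon)^{(1+\varepsilon)rW}$ of h is dropped because it does not affect which ring minimizes h. The names `crossed`, `route_offline`, and `link_loads` are hypothetical.

```python
from typing import List, Sequence, Tuple

Packet = Tuple[int, int]  # (source node, number of hops <= n); an assumed encoding


def crossed(p: Packet, n: int) -> List[int]:
    """Positions q (the parallel links from node q to q+1 mod n) crossed by packet p."""
    src, hops = p
    return [(src + k) % n for k in range(hops)]


def route_offline(packets: Sequence[Packet], n: int, c: int, eps: float) -> List[int]:
    """Choose a ring for each packet, in order, so that the numerator of h never increases."""
    unrouted = 1.0 + eps / c
    # prod[q][k]: product over packets of their factor for link k at position q.
    prod = [[1.0] * c for _ in range(n)]
    for p in packets:
        for q in crossed(p, n):
            for k in range(c):
                prod[q][k] *= unrouted

    rings: List[int] = []
    for p in packets:
        qs = crossed(p, n)

        def change(ring: int) -> float:
            # Change in the sum of link products if p is fixed to `ring`.
            return sum(
                prod[q][k] * (((1.0 + eps) if k == ring else 1.0) / unrouted - 1.0)
                for q in qs
                for k in range(c)
            )

        best = min(range(c), key=change)  # the average change over rings is 0, so this is <= 0
        for q in qs:
            for k in range(c):
                prod[q][k] *= ((1.0 + eps) if k == best else 1.0) / unrouted
        rings.append(best)
    return rings


def link_loads(packets: Sequence[Packet], rings: Sequence[int], n: int, c: int) -> List[List[int]]:
    """Number of packets routed over each link, indexed as load[position][ring]."""
    load = [[0] * c for _ in range(n)]
    for p, k in zip(packets, rings):
        for q in crossed(p, n):
            load[q][k] += 1
    return load
```

The products are kept as plain floating-point numbers, which is adequate for small examples. For large W a logarithmic representation would be needed to avoid overflow.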






5.4 On-line Routing 

Now we want to route packets as soon as they are injected. This does not allow us to directly
use the above derandomization process, since we will not necessarily know the set P by the time
we need to route the first packets, and P is needed to compute the different values of the function
$h(\cdot, \ldots, \cdot)$. However, we will deal with this problem by making pessimistic assumptions about the
packets that have not been injected yet.

First consider two packets, $p_k$ and $p_l$, such that their paths do not overlap, and the destination
node of $p_k$ is the source node of $p_l$. Replace these packets by one single packet whose source node is
that of $p_k$ and whose destination node is that of $p_l$. Observe that, for fixed $\delta_0, \ldots, \delta_i$, if $k > i$ and $l > i$,
the value of $h(\delta_0, \ldots, \delta_i)$ does not change by the replacement (see above). This can be generalized
to the replacement of any number of packets.

This allows us to use the following trick. Initially we assume a set $P^{(0)}$ of packets that
consists of crW ghost packets going from node i to node i + 1 (mod n), for each i. The value $h(\emptyset)$
is computed for this set $P^{(0)}$.

Now, assume that the packets $p_0, \ldots, p_{i-1}$ have already been injected and routed. (That is, the values
$\delta_0, \delta_1, \ldots, \delta_{i-1}$ are fixed and $h(\delta_0, \delta_1, \ldots, \delta_{i-1})$ is computed.) When the next packet $p_i$ is injected,
we remove one ghost packet from the set $P^{(i-1)}$ for each hop that $p_i$ crosses. These ghost packets
are replaced by the packet $p_i$ to obtain a new set $P^{(i)}$. The existence of the appropriate ghost
packets is guaranteed by the initial ghost packets we put in $P^{(0)}$ and the fact that the injections
are (w, r)-admissible. As we saw previously, this does not change the value of $h(\delta_0, \delta_1, \ldots, \delta_{i-1})$.
Then, route the packet $p_i$ (choose and fix $\delta_i$) so that $h(\delta_0, \delta_1, \ldots, \delta_{i-1}) \ge h(\delta_0, \delta_1, \ldots, \delta_{i-1}, \delta_i)$.

By repeating this process, at the end of the W-interval we have that

$$g(\delta_0, \delta_1, \ldots, \delta_\ell) \le h(\delta_0, \delta_1, \ldots, \delta_\ell) \le h(\emptyset) \le \beta,$$

where $\ell$ is the number of packets injected during the W-interval. We now remove all the remaining
ghost packets. This process eliminates any remaining randomness in $g(\delta_0, \delta_1, \ldots, \delta_\ell)$, and can
never increase its value, since it only removes packets. Then, since $g(\delta_0, \delta_1, \ldots, \delta_\ell)$ involves
no randomness and $\beta < 1$, $g(\delta_0, \delta_1, \ldots, \delta_\ell) = 0$ and no link has congestion more than $(1-\varepsilon^2)W$.
Hence, the routing is weakly (W, R)-admissible.
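
The ghost-packet bookkeeping can be sketched as a small Python class. This is a minimal illustration under assumptions that are not in the paper: the class and method names are hypothetical, packets are given as a source node and a hop count of at most n, each ghost or unrouted packet contributes the factor $1 + \varepsilon/c$ to each of the c links at its position, and the constant denominator of h is omitted.

```python
from typing import List


class OnlineRingRouter:
    """Routes packets one at a time on a ring with n nodes and c parallel links."""

    def __init__(self, n: int, c: int, eps: float, ghosts_per_position: int) -> None:
        self.n, self.c, self.eps = n, c, eps
        self.unrouted = 1.0 + eps / c            # factor of a ghost (not yet routed) packet
        self.ghosts: List[int] = [ghosts_per_position] * n
        start = self.unrouted ** ghosts_per_position
        # prod[q][k]: product of per-packet factors for link k at position q.
        self.prod = [[start] * c for _ in range(n)]
        self.load = [[0] * c for _ in range(n)]

    def route(self, src: int, hops: int) -> int:
        """Fix and return the ring of a newly injected packet."""
        qs = [(src + k) % self.n for k in range(hops)]
        if any(self.ghosts[q] == 0 for q in qs):
            raise ValueError("more packets than the ghost budget allows")

        def change(ring: int) -> float:
            # One ghost factor per crossed link is swapped for 1 + eps or 1.
            return sum(
                self.prod[q][k]
                * (((1.0 + self.eps) if k == ring else 1.0) / self.unrouted - 1.0)
                for q in qs
                for k in range(self.c)
            )

        best = min(range(self.c), key=change)
        for q in qs:
            self.ghosts[q] -= 1                  # the real packet replaces one ghost per hop
            for k in range(self.c):
                self.prod[q][k] *= ((1.0 + self.eps) if k == best else 1.0) / self.unrouted
            self.load[q][best] += 1
        return best

    def max_load(self) -> int:
        """Largest number of packets routed over any single link so far."""
        return max(max(row) for row in self.load)
```

A router would be created at the start of each W-interval with crW ghosts per position, and `route` called once per injected packet. Ghosts left over at the end of the interval are simply discarded, which can only lower the link products.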

6 Conclusions 

In this paper we have presented source routing algorithms for packet-switched networks and we 
have described the first distributed, deterministic scheduling protocol with a polynomial delay 
bound. There is much still to be explored in the study of combined routing and scheduling. For 
example, different packets are often associated with different delay requirements. Some of them 
may be delay-sensitive whereas others may be delay-tolerant. The problem of scheduling these 
packets on given routes in order to meet these delay requirements has been studied before. The 
ability to choose the routes would add an additional dimension to the problem and may even make 
scheduling easier. 